Building Effective Agents

Simple, Composable Patterns for LLM Systems

Over the past year, we've worked with dozens of teams building large language model (LLM) agents across industries. Consistently, the most successful implementations weren't using complex frameworks or specialized libraries. Instead, they were building with simple, composable patterns.

In this presentation, we share what we've learned from working with customers and building agents ourselves, and give practical advice for developers on building effective agents.

What are agents?

"Agent" can be defined in several ways. At Anthropic, we categorize all these variations as agentic systems, but draw an important architectural distinction:

Workflows

Systems where LLMs and tools are orchestrated through predefined code paths.

Agents

Systems where LLMs dynamically direct their own processes and tool usage, maintaining control over how they accomplish tasks.

Below, we will explore both types of agentic systems in detail.

When (and when not) to use agents

When building applications with LLMs, we recommend finding the simplest solution possible, and only increasing complexity when needed. This might mean not building agentic systems at all.

Considerations

Agentic systems often trade latency and cost for better task performance; consider when that tradeoff makes sense. Workflows offer predictability and consistency for well-defined tasks, while agents are the better option when flexibility and model-driven decision-making are needed at scale.

When and how to use frameworks

There are many frameworks that make agentic systems easier to implement, including LangGraph from LangChain, Amazon Bedrock's AI Agent framework, and GUI workflow builders such as Rivet and Vellum.

Our recommendation

We suggest that developers start by using LLM APIs directly: many patterns can be implemented in a few lines of code. If you do use a framework, make sure you understand the underlying code; incorrect assumptions about what's under the hood are a common source of error.

Building block: The augmented LLM

The basic building block of agentic systems is an LLM enhanced with augmentations such as retrieval, tools, and memory. Our current models can actively use these capabilities—generating their own search queries, selecting appropriate tools, and determining what information to retain.

The augmented LLM

We recommend focusing on two key aspects: tailoring these capabilities to your specific use case and ensuring they provide an easy, well-documented interface for your LLM.
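
As a concrete illustration, here is a minimal Python sketch of an augmented LLM, using the Anthropic Messages API for the model call. The `retrieve` function is a hypothetical stand-in for your own retrieval layer, and the prompt format is illustrative; any LLM API would work in place of `llm_call`.

```python
import anthropic

client = anthropic.Anthropic()  # assumes ANTHROPIC_API_KEY is set in the environment

def llm_call(prompt: str) -> str:
    """One model call; returns the text of the response."""
    resp = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.content[0].text

def retrieve(query: str) -> list[str]:
    """Hypothetical retrieval layer -- swap in your vector store or search index."""
    return ["<relevant snippet 1>", "<relevant snippet 2>"]

def augmented_llm(question: str) -> str:
    # Augment the prompt with retrieved context before calling the model.
    context = "\n".join(retrieve(question))
    return llm_call(f"Context:\n{context}\n\nQuestion: {question}")
```

The same small-wrapper approach extends to tools and memory: keep each augmentation behind an interface the model can use reliably.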

Workflow: Prompt chaining

Prompt chaining decomposes a task into a sequence of steps, where each LLM call processes the output of the previous one. You can add programmatic checks on any intermediate steps to ensure that the process is still on track.

The prompt chaining workflow

When to use this workflow

Ideal for situations where the task can be easily and cleanly decomposed into fixed subtasks. The main goal is to trade off latency for higher accuracy, by making each LLM call an easier task.
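
A minimal sketch of a two-step chain with a programmatic gate between the steps, under the same Anthropic-SDK assumption as above; the summarization task and the emptiness check are illustrative.

```python
import anthropic

client = anthropic.Anthropic()

def llm_call(prompt: str) -> str:
    # Same one-call wrapper as in the augmented-LLM sketch.
    resp = client.messages.create(model="claude-3-5-sonnet-20241022", max_tokens=1024,
                                  messages=[{"role": "user", "content": prompt}])
    return resp.content[0].text

def chain(document: str) -> str:
    # Step 1: extract the claims the document makes.
    claims = llm_call(f"List the factual claims made in this document:\n{document}")
    # Programmatic gate between steps: stop early if extraction failed.
    if not claims.strip():
        raise ValueError("Extraction produced no claims; stopping the chain.")
    # Step 2: a second, easier call works only from the extracted claims.
    return llm_call(f"Write a one-paragraph summary of these claims:\n{claims}")
```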

Workflow: Routing

Routing classifies an input and directs it to a specialized follow-up task. This workflow allows for separation of concerns and for building more specialized prompts.

The routing workflow

When to use this workflow

Works well for complex tasks where there are distinct categories that are better handled separately, and where classification can be handled accurately.

Examples

  • Directing different types of customer service queries (general questions, refund requests, technical support) into distinct downstream processes, prompts, and tools.
  • Routing easy, common questions to smaller, faster models and hard, unusual questions to more capable models to optimize cost and speed.
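
A minimal routing sketch under the same Anthropic-SDK assumption; the three route labels and their prompts are hypothetical.

```python
import anthropic

client = anthropic.Anthropic()

def llm_call(prompt: str) -> str:
    # Same one-call wrapper as in the augmented-LLM sketch.
    resp = client.messages.create(model="claude-3-5-sonnet-20241022", max_tokens=1024,
                                  messages=[{"role": "user", "content": prompt}])
    return resp.content[0].text

# Each route gets its own specialized prompt (and could get its own model or tools).
ROUTES = {
    "refund": "You are a refunds specialist. Handle this request:\n",
    "technical": "You are a technical support engineer. Diagnose this issue:\n",
    "general": "You are a friendly support agent. Answer this question:\n",
}

def route(ticket: str) -> str:
    # Step 1: classify the input.
    label = llm_call(
        f"Classify this support ticket as exactly one of: {', '.join(ROUTES)}. "
        f"Reply with the label only.\n\n{ticket}"
    ).strip().lower()
    # Step 2: direct it to the specialized follow-up prompt, with a safe fallback.
    return llm_call(ROUTES.get(label, ROUTES["general"]) + ticket)
```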

Workflow: Parallelization

LLMs can sometimes work simultaneously on a task and have their outputs aggregated programmatically. This workflow manifests in two key variations:

Sectioning

Breaking a task into independent subtasks run in parallel.

Voting

Running the same task multiple times to get diverse outputs.

The parallelization workflow

When to use this workflow

Effective when subtasks can be parallelized for speed, or when multiple perspectives are needed for higher confidence results.
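
A minimal sketch of both variations, using a thread pool to issue the model calls concurrently; the review aspects and the yes/no voting scheme are illustrative.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

import anthropic

client = anthropic.Anthropic()

def llm_call(prompt: str) -> str:
    # Same one-call wrapper as in the augmented-LLM sketch.
    resp = client.messages.create(model="claude-3-5-sonnet-20241022", max_tokens=1024,
                                  messages=[{"role": "user", "content": prompt}])
    return resp.content[0].text

def section(document: str, aspects: list[str]) -> dict[str, str]:
    """Sectioning: review independent aspects of a document in parallel."""
    with ThreadPoolExecutor() as pool:
        reviews = pool.map(
            lambda aspect: llm_call(f"Review this document for {aspect}:\n{document}"),
            aspects,
        )
        return dict(zip(aspects, reviews))

def vote(question: str, n: int = 5) -> str:
    """Voting: run the same yes/no question n times and take the majority answer."""
    with ThreadPoolExecutor() as pool:
        answers = pool.map(
            lambda _: llm_call(f"{question}\nAnswer 'yes' or 'no' only.").strip().lower(),
            range(n),
        )
        return Counter(answers).most_common(1)[0][0]
```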

Workflow: Orchestrator-workers

In the orchestrator-workers workflow, a central LLM dynamically breaks down tasks, delegates them to worker LLMs, and synthesizes their results.

The orchestrator-workers workflow

When to use this workflow

Well-suited for complex tasks where you can't predict the subtasks needed. The key difference from parallelization is its flexibility—subtasks aren't pre-defined, but determined by the orchestrator.

Example

Coding products that need to make complex changes across multiple files, where the right set of edits can't be predicted in advance.
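
A minimal sketch of the pattern; having the orchestrator plan in JSON is one convenient convention, not the only one.

```python
import json

import anthropic

client = anthropic.Anthropic()

def llm_call(prompt: str) -> str:
    # Same one-call wrapper as in the augmented-LLM sketch.
    resp = client.messages.create(model="claude-3-5-sonnet-20241022", max_tokens=1024,
                                  messages=[{"role": "user", "content": prompt}])
    return resp.content[0].text

def orchestrate(task: str) -> str:
    # The orchestrator decides the subtasks at runtime, not in code.
    plan = llm_call("Break this task into independent subtasks. "
                    f"Reply with a JSON array of strings only.\n\nTask: {task}")
    subtasks = json.loads(plan)  # production code should validate and retry here
    # Workers complete the subtasks (these calls could run in parallel).
    results = [llm_call(f"Complete this subtask:\n{s}") for s in subtasks]
    # The orchestrator synthesizes the workers' results into one answer.
    return llm_call("Synthesize these partial results into one answer.\n\n"
                    f"Task: {task}\n\nResults:\n" + "\n---\n".join(results))
```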

Workflow: Evaluator-optimizer

In the evaluator-optimizer workflow, one LLM call generates a response while another provides evaluation and feedback in a loop.

The evaluator-optimizer workflow

When to use this workflow

Particularly effective when we have clear evaluation criteria, and when iterative refinement provides measurable value. This is analogous to the iterative writing process a human writer might go through.

Examples

  • Literary translation, where an evaluator LLM can catch nuances that the translator LLM misses on a first pass.
  • Complex search tasks that require multiple rounds of searching and analysis, where the evaluator decides whether further searching is warranted.
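
A minimal sketch of the generate-evaluate-revise loop; the PASS convention and the fixed round budget are illustrative choices.

```python
import anthropic

client = anthropic.Anthropic()

def llm_call(prompt: str) -> str:
    # Same one-call wrapper as in the augmented-LLM sketch.
    resp = client.messages.create(model="claude-3-5-sonnet-20241022", max_tokens=1024,
                                  messages=[{"role": "user", "content": prompt}])
    return resp.content[0].text

def refine(task: str, max_rounds: int = 3) -> str:
    draft = llm_call(f"Complete this task:\n{task}")
    for _ in range(max_rounds):
        # The evaluator judges the draft against explicit criteria.
        feedback = llm_call("Evaluate this draft against the task. Reply PASS if it "
                            "fully satisfies the task; otherwise list concrete fixes."
                            f"\n\nTask: {task}\n\nDraft:\n{draft}")
        if feedback.strip().startswith("PASS"):
            break
        # The optimizer revises the draft using the evaluator's feedback.
        draft = llm_call(f"Revise the draft to address the feedback.\n\nTask: {task}"
                         f"\n\nDraft:\n{draft}\n\nFeedback:\n{feedback}")
    return draft
```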

Agents

Agents are emerging in production as LLMs mature in key capabilities—understanding complex inputs, engaging in reasoning and planning, using tools reliably, and recovering from errors.

Autonomous agent

When to use agents

Agents can be used for open-ended problems where it's difficult or impossible to predict the required number of steps, and where you can't hardcode a fixed path. Because the LLM may operate for many turns, you must have some level of trust in its decision-making.

Note: The autonomous nature of agents means higher costs and the potential for compounding errors. We recommend extensive testing in sandboxed environments, along with appropriate guardrails.
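
To make the loop concrete, here is a minimal sketch of an autonomous coding-style agent built on the Anthropic tool-use API: the model plans, calls a tool, and gets ground truth back from the environment on each turn. The single `run_tests` tool and the pytest command are illustrative assumptions; a real deployment would add sandboxing and guardrails around tool execution.

```python
import subprocess

import anthropic

client = anthropic.Anthropic()

TOOLS = [{
    "name": "run_tests",
    "description": "Run the project's test suite and return its full output.",
    "input_schema": {"type": "object", "properties": {}},
}]

def run_tests() -> str:
    # Ground truth from the environment: real test results, not the model's guess.
    proc = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
    return proc.stdout + proc.stderr

def agent(task: str, max_turns: int = 10) -> str:
    messages = [{"role": "user", "content": task}]
    for _ in range(max_turns):  # hard turn budget to bound cost
        resp = client.messages.create(model="claude-3-5-sonnet-20241022",
                                      max_tokens=2048, tools=TOOLS, messages=messages)
        if resp.stop_reason != "tool_use":
            # The model decided it is done; return its final text.
            return "".join(b.text for b in resp.content if b.type == "text")
        # Execute each requested tool and feed the real results back to the model.
        messages.append({"role": "assistant", "content": resp.content})
        messages.append({"role": "user", "content": [
            {"type": "tool_result", "tool_use_id": b.id, "content": run_tests()}
            for b in resp.content if b.type == "tool_use"
        ]})
    raise RuntimeError("Agent did not finish within the turn budget.")
```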

Combining and customizing these patterns

These building blocks aren't prescriptive. They're common patterns that developers can shape and combine to fit different use cases.

Key to success

As with any LLM feature, measuring performance and iterating on implementations is crucial. Add complexity only when it demonstrably improves outcomes.

High-level flow of a coding agent

Frameworks can help you get started quickly, but don't hesitate to reduce abstraction layers and build with basic components as you move to production.

Summary

Success in the LLM space isn't about building the most sophisticated system. It's about building the right system for your needs.

Our approach

  1. Start with simple prompts
  2. Optimize them with comprehensive evaluation
  3. Add multi-step agentic systems only when simpler solutions fall short

Three core principles for implementing agents

  1. Maintain simplicity in your agent's design
  2. Prioritize transparency by explicitly showing the agent's planning steps
  3. Carefully craft your agent-computer interface (ACI) through thorough tool documentation and testing

Appendix 1: Agents in practice

Our work with customers has revealed two particularly promising applications for AI agents that demonstrate the practical value of the patterns discussed above.

Customer support

A natural fit for more open-ended agents because:

  • Support interactions naturally follow a conversation flow
  • Tools can pull customer data and order history
  • Actions like issuing refunds can be handled programmatically
  • Success can be clearly measured through user-defined resolutions

Coding agents

Particularly effective because:

  • Code solutions are verifiable through automated tests
  • Agents can iterate using test results as feedback
  • The problem space is well-defined and structured
  • Output quality can be measured objectively

Appendix 2: Prompt engineering your tools

No matter which agentic system you're building, tools will likely be an important part of your agent. Tool definitions and specifications should be given just as much prompt engineering attention as your overall prompts.

Our suggestions for deciding on tool formats

  • Give the model enough tokens to "think" before it writes itself into a corner.
  • Keep the format close to what the model has seen naturally occurring in text on the internet.
  • Make sure there's no formatting "overhead", such as having to keep an accurate count of thousands of lines of code or string-escaping any code it writes.

Creating good agent-computer interfaces (ACI)

  • Put yourself in the model's shoes: is it obvious how to use the tool from the description and parameters alone, or would it take careful thought? A good tool definition reads like a good docstring.
  • Test how the model actually uses your tools: run many example inputs, see what mistakes the model makes, and iterate.
  • Poka-yoke your tools: change the arguments so that it is harder to make mistakes with them.
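
As an example of what that attention looks like, here is a hypothetical tool definition in the Anthropic Messages API format. The `edit_file` tool and its exact-match, unique-match requirement are illustrative, but they show both the docstring-quality description and the mistake-proofing we mean.

```python
# A hypothetical tool definition. The description and parameter docs are
# written with as much care as any prompt, and the unique-match rule makes
# the tool fail safely instead of applying an ambiguous edit.
edit_file_tool = {
    "name": "edit_file",
    "description": (
        "Replace an exact, unique snippet of text in a file. Use this for "
        "small, targeted edits. old_text must appear exactly once in the "
        "file, including whitespace; if it does not, the call fails and no "
        "change is made."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "path": {"type": "string",
                     "description": "Path to the file, relative to the repo root."},
            "old_text": {"type": "string",
                         "description": "Exact text to replace; must be unique in the file."},
            "new_text": {"type": "string",
                         "description": "Text to insert in place of old_text."},
        },
        "required": ["path", "old_text", "new_text"],
    },
}
```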
